NSF PAR Search | NSF Public Access Repository

Protein‐Ligand Structure and Affinity Prediction in CASP16 Using a Geometric Deep Learning Ensemble and Flow Matching

https://doi.org/10.1002/prot.26827

Morehead, Alex; Liu, Jian; Neupane, Pawan; Giri, Nabin; Cheng, Jianlin (April 2025, Proteins: Structure, Function, and Bioinformatics)

Abstract Predicting the structure of ligands bound to proteins is a foundational problem in modern biotechnology and drug discovery, yet little is known about how to combine the predictions of protein‐ligand structure (poses) produced by the latest deep learning methods to identify the best poses and how to accurately estimate the binding affinity between a protein target and a list of ligand candidates. Further, a blind benchmarking and assessment of protein‐ligand structure and binding affinity prediction is necessary to ensure it generalizes well to new settings. Towards this end, we introduce MULTICOM_ligand, a deep learning‐based protein‐ligand structure and binding affinity prediction ensemble featuring structural consensus ranking for unsupervised pose ranking and a new deep generative flow matching model for joint structure and binding affinity prediction. Notably, MULTICOM_ligand ranked among the top‐5 ligand prediction methods in both protein‐ligand structure prediction and binding affinity prediction in the 16th Critical Assessment of Techniques for Structure Prediction (CASP16), demonstrating its efficacy and utility for real‐world drug discovery efforts. The source code for MULTICOM_ligand is freely available on GitHub.
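The abstract mentions structural consensus ranking for unsupervised pose ranking but does not describe the procedure. As a rough illustration of the general idea only, and not the MULTICOM_ligand implementation, the sketch below scores each candidate pose by its mean pairwise RMSD to all other candidates and ranks the most "central" pose first. The function name `consensus_rank`, the plain coordinate RMSD, and the assumption that all poses share atom order and coordinate frame are choices made for this example.

```python
import numpy as np


def consensus_rank(poses):
    """Rank poses by mean pairwise RMSD to all other poses (lower is better).

    poses: array-like of shape (n_poses, n_atoms, 3). Assumes every pose lists
    the same atoms in the same order and in the same coordinate frame
    (for example, the receptor frame), so no superposition is applied.
    """
    poses = np.asarray(poses, dtype=float)
    n = poses.shape[0]
    # Pairwise coordinate differences, shape (n, n, n_atoms, 3).
    diff = poses[:, None, :, :] - poses[None, :, :, :]
    # RMSD matrix, shape (n, n); the diagonal is zero.
    rmsd = np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))
    # Average over the other n - 1 poses (the zero diagonal adds nothing).
    mean_rmsd = rmsd.sum(axis=1) / max(n - 1, 1)
    order = np.argsort(mean_rmsd, kind="stable")
    return order, mean_rmsd
```

Calling `consensus_rank(poses)` returns the pose indices from most to least consensual together with the per-pose scores. A real pipeline would likely need symmetry-aware ligand RMSD, which this sketch omits.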
Free, publicly-accessible full text available April 8, 2026
Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions

https://doi.org/10.1016/j.sbi.2023.102536

Giri, Nabin; Roy, Raj S.; Cheng, Jianlin (April 2023, Current Opinion in Structural Biology)

Full Text Available
DRLComplex: Reconstruction of Protein Quaternary Structures Using Deep Reinforcement Learning

Soltanikazemi, Elham; Roy, Raj; Quadir, Farhan; Giri, Nabin; Morehead, Alex; Cheng, Jianlin (July 2023, The International Conference on Intelligent Biology and Medicine (ICIBM))
Improving Protein–Ligand Interaction Modeling with cryo-EM Data, Templates, and Deep Learning in 2021 Ligand Model Challenge

https://doi.org/10.3390/biom13010132

Giri, Nabin; Cheng, Jianlin (January 2023, Biomolecules)

Elucidating protein–ligand interaction is crucial for studying the function of proteins and compounds in an organism and critical for drug discovery and design. The problem of protein–ligand interaction is traditionally tackled by molecular docking and simulation, which is based on physical forces and statistical potentials and cannot effectively leverage cryo-EM data and existing protein structural information in the protein–ligand modeling process. In this work, we developed a deep learning bioinformatics pipeline (DeepProLigand) to predict protein–ligand interactions from cryo-EM density maps of proteins and ligands. DeepProLigand first uses a deep learning method to predict the structure of proteins from cryo-EM maps, which is averaged with a reference (template) structure of the proteins to produce a combined structure to add ligands. The ligands are then identified and added into the structure to generate a protein–ligand complex structure, which is further refined. The method based on the deep learning prediction and template-based modeling was blindly tested in the 2021 EMDataResource Ligand Challenge and was ranked first in fitting ligands to cryo-EM density maps. These results demonstrate that the deep learning bioinformatics approach is a promising direction for modeling protein–ligand interactions on cryo-EM data using prior structural information.
Full Text Available
Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15

https://doi.org/10.1002/prot.26542

Roy, Raj S; Liu, Jian; Giri, Nabin; Guo, Zhiye; Cheng, Jianlin (June 2023, Proteins: Structure, Function, and Bioinformatics)

Abstract Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models is proven useful for estimating the quality of protein tertiary structural models, but it has been rarely applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter‐chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per‐target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per‐target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi‐model method (PSS) with the complementary single‐model method (ICPS) is a promising approach to EMA.
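The two summary numbers in this abstract (average per‐target correlation and average per‐target ranking loss) can be illustrated with a short sketch. The abstract does not define either metric precisely, so this example assumes a Pearson correlation and the commonly used definition of ranking loss: the true quality of the best available model minus the true quality of the model ranked first by the predicted scores. The function names and the input layout are hypothetical.

```python
import numpy as np


def pearson(pred, true):
    """Pearson correlation between predicted and true quality scores."""
    return float(np.corrcoef(pred, true)[0, 1])


def ranking_loss(pred, true):
    """True score of the best model minus true score of the top-ranked model."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(true.max() - true[np.argmax(pred)])


def evaluate(targets):
    """Average both metrics over targets.

    targets: dict mapping a target name to a (pred_scores, true_scores) pair,
    where both sequences list the same models in the same order.
    """
    corrs = [pearson(p, t) for p, t in targets.values()]
    losses = [ranking_loss(p, t) for p, t in targets.values()]
    return float(np.mean(corrs)), float(np.mean(losses))
```

For instance, `evaluate({"T1": ([0.9, 0.5, 0.1], [0.7, 0.8, 0.2])})` yields a ranking loss of about 0.1 for that single target, because the model ranked first has a true score of 0.7 while the best model scores 0.8. Note that `pearson` returns NaN when either score list is constant.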
High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function

https://doi.org/10.1109/MLHPC54614.2021.00010

Gao, Mu; Lund-Andersen, Peik; Morehead, Alex; Mahmud, Sajid; Chen, Chen; Chen, Xiao; Giri, Nabin; Roy, Raj S.; Quadir, Farhan; Effler, T. Chad; et al (November 2021, IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC))

Full Text Available
Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge

https://doi.org/10.1038/s41592-024-02321-7

Lawson, Catherine L; Kryshtafovych, Andriy; Pintilie, Grigore D; Burley, Stephen K; Černý, Jiří; Chen, Vincent B; Emsley, Paul; Gobbi, Alberto; Joachimiak, Andrzej; Noreng, Sigrid; et al (July 2024, Nature Methods)

Full Text Available
Impact of AlphaFold on structure prediction of protein complexes: The CASP15‐CAPRI experiment

https://doi.org/10.1002/prot.26609

Lensink, Marc F; Brysbaert, Guillaume; Raouraoua, Nessim; Bates, Paul A; Giulini, Marco; Honorato, Rodrigo V; van Noort, Charlotte; Teixeira, Joao M C; Bonvin, Alexandre M J J; Kong, Ren; et al (October 2023, Proteins: Structure, Function, and Bioinformatics)

Abstract We present the results for CAPRI Round 54, the 5th joint CASP‐CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo‐trimers, 13 heterodimers including 3 antibody–antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High‐quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2‐Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2‐Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.